MATHEMATICAL ANALYSIS OF A MULTIPLE-LOOK CONCEPT IDENTIFICATION MODEL



John W. Cotton 

University of California, Santa Barbara
Abstract 

The behavior of focus samples central to the multiple-look model of
Trabasso and Bower (1968) is examined by three methods. First, exact
probabilities of success conditional upon a certain brief history of
stimulation are determined. Second, possible states of the organism during
the experiment are defined and a transition matrix for those states is
determined, permitting prediction over all possible numbers of trials. Third,
Fisher's generalizations and corrections of the Trabasso and Bower focus
sample theory are examined. A general solution for the conditional
probability of success is derived from Fisher's equation for the
probability of n successes between any two errors. One very strong
implication of the theory is given in Section 5.



1. Introduction 

Consider a K-dimensional binary-response concept identification task
with one or more dimensions being relevant. A possible solution of the
task might be that a stimulus including value 1 of dimension A ($A_1$) should
be followed by Response 1 ($R_1$) and that $A_2$ should be followed by $R_2$.
Thus A is a relevant dimension. We require that with more than one
relevant dimension all such dimensions give redundant information. Thus,
in addition to our assumption about dimension A, we might assume that $B_1$
must be followed by $R_1$, $B_2$ must be followed by $R_2$, and that
presentation of $A_1$ ($A_2$) always implies presentation of $B_1$ ($B_2$). Thus B is
also a relevant dimension. Trabasso and Bower (1968, pp. 54-57) present a
model for a focus sample of x relevant and s - x irrelevant cues to
which a person may attend on any trial. The focus sample is a crucial part
of a multiple-look model because it permits the learner to attend to more
than one conceivably crucial cue on any one trial, with a subsequent
reduction in the number thus noted as new trials give new information.

Trabasso and Bower (1968, p. 54) note that a random sequence of stimulus
patterns implies that "each irrelevant cue will have an independent
probability p on each trial of being allied with the correct, relevant one."
Furthermore, "the probability of the correct response is the proportion of
cues in the focus sample that dictate that response." A cue is a dimension
value, not a dimension.

Trabasso and Bower begin their derivation by assuming an error on an
arbitrarily numbered trial, $T_0$. At this point the learner selects s cues.
We denote by $\{s\}$ the set of such elements, with the convention that we list
dimension values which evoke $R_1$, opposite values evoking $R_2$. Thus, if on
$T_0$ the stimulus $A_1B_1C_1$ should have led to $R_1$, $\{A_1, B_1, C_1\}$ is an
acceptable set $\{s\}$, for s = 3.

Trabasso and Bower also permit focus samples in which the same cue appears
more than once. For example, $\{s\} = \{B_1, B_1, C_1\}$ is also acceptable in this
case.

On the next trial, $T_1$, Trabasso and Bower predict the following
proportion of successes:

$$\Pr(S_1 \mid E_0) = \frac{x + p(s - x)}{s} \tag{1}$$

because x plus p(s - x) is the expected number of cues yielding a correct
response, and x + (s - x) = s is the total number of cues from which
selection is being made. On subsequent trials in a series of successful trials,
any cue which would not have led to a correct response on the immediately
previous trial is excluded from the focus sample. The expected number of
cues remaining in the focus sample becomes the denominator of a new predictive
equation; the expected number of cues which would yield a correct response on
the next trial becomes the numerator of that equation. Therefore the following
probability is assigned for the (n + 1)-th success conditional upon n
successes in a row following $T_0$:

$$\Pr(S_{n+1} \mid S_n \cdots S_1 E_0) = \frac{x + p^{n+1}(s - x)}{x + p^{n}(s - x)} \tag{2}$$

The Trabasso and Bower proof of (2) is brief and appears to be marred by use of
the expected operator approximation (Sternberg, 1963, pp. 40-47) without noting
that the expected values just discussed should, after $T_1$, have been
conditionalized subject to successes on all trials up to the point of any
prediction. It seems appropriate to make a more rigorous analysis of the
consequences of Trabasso and Bower's assumptions. We begin with an examination
of specific stimulus sequences.
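As a reading aid, the two Trabasso and Bower predictions referred to as (1) and (2) can be evaluated numerically. The sketch below assumes the forms Pr(S_1 | E_0) = [x + p(s - x)]/s and Pr(S_{n+1} | S_n ... S_1 E_0) = [x + p^(n+1)(s - x)]/[x + p^n(s - x)]; the function name `p_success_after_run` is ours and does not come from the source.

```python
from fractions import Fraction


def p_success_after_run(x, s, p, n):
    """Eq. (2): Pr(S_{n+1} | S_n ... S_1 E_0).

    x: number of relevant cues in the focus sample
    s: focus sample size
    p: probability that an irrelevant cue agrees with the relevant one
    n: number of consecutive successes already observed (n = 0 gives Eq. 1)
    """
    p = Fraction(p)
    numerator = x + p ** (n + 1) * (s - x)
    denominator = x + p ** n * (s - x)
    return numerator / denominator


if __name__ == "__main__":
    # The running example of the paper: x = 1, s = 3, p = 1/2.
    for n in range(3):
        print(n, p_success_after_run(1, 3, Fraction(1, 2), n))
```

Exact fractions are used so that the values can be compared directly with tabled probabilities rather than with rounded decimals.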

2. Determination of Response Probabilities for All
Stimulus Patterns in a Three-Trial Sequence

The discussion below is dependent upon knowledge of a term from Cotton (in
press): Congruence (i) is defined as the number of dimensions, including
the relevant one(s), which is (are) consistent with the relevant dimension(s)
in changing value(s) from Trial n to Trial n + 1 when the relevant
dimension(s) change(s) or in remaining constant when the relevant dimension(s)
remain(s) constant. (The possibility of two or more redundant relevant
dimensions is accommodated by the parenthesized s's.)
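The definition of congruence can be made concrete with a small sketch. It assumes that a stimulus is coded as a tuple of dimension values (1 or 2) and that the relevant dimension sits at index 0; both conventions, and the function name `congruence`, are ours rather than the source's.

```python
def congruence(previous, current, relevant=0):
    """Count the dimensions that behave like the relevant dimension.

    A dimension is counted when it changes value from one trial to the
    next exactly when the relevant dimension changes, or stays constant
    exactly when the relevant dimension stays constant. The relevant
    dimension itself is always counted.
    """
    relevant_changed = previous[relevant] != current[relevant]
    return sum(
        (old != new) == relevant_changed
        for old, new in zip(previous, current)
    )


if __name__ == "__main__":
    t0 = (1, 1, 1)  # A1 B1 C1
    for t1 in [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 2, 2)]:
        print(t1, congruence(t0, t1))
```

Note that a stimulus and its complement receive the same congruence value relative to a given previous stimulus only when the comparison is made through the relevant dimension, which is why the relevant index is an explicit argument.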

On the n-th trial of a K-dimensional binary concept problem there
will be $2^{Kn}$ possible branches reflecting different sequences of
stimuli which may have occurred on the n trials. Though $2^{Kn}$ is much too
many branches to examine explicitly for large n, we can gain useful
information by examining a few trials fully in order to determine the possible
states of a Markov process and transition probabilities presumed to correspond to
the theory in question. Let us consider an example with K = 3, one relevant
dimension ($A_1$ should be followed by $R_1$ and $A_2$ by $R_2$), and with
$\{s\} = \{A_1, B_1, C_1\}$, one of the acceptable focus samples of size 3 which
could follow an error on $T_0$ for the stimulus $A_1B_1C_1$. We assume for the
moment that every one of the eight possible stimulus patterns is equally likely
on each trial, with patterns on pairs of trials having independent probabilities.

Under this assumption (and some less restrictive ones), complementary
stimuli $A_jB_kC_l$ and $A_{j'}B_{k'}C_{l'}$, with $j \ne j'$, $k \ne k'$, and
$l \ne l'$ all simultaneously holding as in the case of $A_1B_1C_1$ and
$A_2B_2C_2$, have equal probabilities of appearance. Furthermore feedback
following presentation of one member of a complementary pair always confirms
the same hypotheses which feedback following the other member would confirm.
Therefore, in the three-dimensional case it will be sufficient to examine
stimulus sequences involving a choice of four stimuli rather than eight.
Table 1 shows the possible sequences based on $A_1B_1C_1$, $A_1B_1C_2$,
$A_1B_2C_1$, and $A_1B_2C_2$, together with congruence (i) values, the
probability of a correct response (Pr) with each stimulus at each stage, and
the $\{s\}$ values resulting from examining $\{s\}$ after each success and
excluding any dimension value which could have led to an error on that trial.
The reader may simply assume that one-half of the events attributed to any
stimulus are actually associated with its complementary stimulus.

Insert Table 1 about here

To read Table 1 easily, one should learn that congruence values (i) for
Trial 1 are represented by Roman numerals I, II and III when cases are
delineated on subsequent trials. Case IIA and Case IIB differentiate i = 2 cases
which involve different stimuli yielding different focus samples. One or two
dots following a numerical specification of a case indicate that one or two
final trials, respectively, may be ignored as to specific stimulus history
because the final focus sample will be independent of that history. Thus for
Case III 1· all focus samples include only the relevant cue following an
i = 3 trial and an i = 1 trial in that order.

Once Table 1 is known, we can use our assumption of equally frequent
stimuli to predict $\Pr(S_1 \mid E_0)$ with the following equation:

$$\Pr(S_1 \mid E_0) = \sum_{\text{seq}} \mathrm{wt}\,(\mathrm{Pr}_1) \tag{3}$$

where wt is the weight or probability of being in a certain sequence, $\mathrm{Pr}_n$
is the probability of a correct response on Trial n ($T_n$) for that sequence,
and the summation is over all sequences. It will be useful to call the
right-hand side of (3) by the name $\Sigma\,\mathrm{Prod}_1$ and to define:

$$\Sigma\,\mathrm{Prod}_{n+1} = \Sigma\,\mathrm{Prod}_n\,(\mathrm{Pr}_{n+1}) = \sum \mathrm{wt}\,\mathrm{Pr}_1 \mathrm{Pr}_2 \cdots \mathrm{Pr}_{n+1} = \Pr(S_{n+1} \cdots S_1 \mid E_0). \tag{4}$$

Note that $\mathrm{Pr}_1$ times wt times $\mathrm{Pr}_2$ is the probability of having
successes on both $T_1$ and $T_2$ during a certain stimulus sequence. It might
seem reasonable to let $\Sigma\,\mathrm{Prod}_2 = \sum \mathrm{Pr}_1\,\mathrm{wt}\,(\mathrm{Pr}_2)$ define the
probability of two successes in a row after an error without further
manipulation. However, the experimental design in question is one in which data
on $T_2$ are not analyzed for subjects making an error on $T_1$. Therefore, we
must take into account the number of subjects remaining for analysis on $T_2$,
i.e., $\Pr(S_1 \mid E_0)$ or $\Sigma\,\mathrm{Prod}_1$. Since (4) holds for every n and

$$\Pr(S_{n+1} \mid S_n \cdots S_1 E_0) = \frac{\Pr(S_{n+1} \cdots S_1 \mid E_0)}{\Pr(S_n \cdots S_1 \mid E_0)}, \tag{5}$$

it follows that

$$\Pr(S_{n+1} \mid S_n \cdots S_1 E_0) = \frac{\Sigma\,\mathrm{Prod}_{n+1}}{\Sigma\,\mathrm{Prod}_n}. \tag{6}$$

Table 2 presents the calculations of $\Pr(S_{n+1} \mid S_n \cdots S_1 E_0)$ for each trial



Insert Table 2 about here 



of the example analyzed in Table 1. Once we determine the value of Trabasso
and Bower's p, we can check Table 2 results against (2). First, we emphasize
that p is not a response property as in Bower and Trabasso (1964);
rather, as the first quotation in this article implies, it is wholly defined
once the stimulus probabilities and the reinforcement rule are known. If
every irrelevant cue, such as $B_1$, is exactly as likely to be paired with
$A_1$ as with $A_2$ in our example, then p = 1/2. But our assumption of equal
probabilities for each possible stimulus pattern assures this equality.
Therefore (2) should hold, yielding the same probabilities as obtained in
Table 2. It does.
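The Table 2 calculation can be imitated by brute-force enumeration of stimulus sequences. The following sketch is ours, not a reproduction of the author's tables: it assumes eight equiprobable, independent stimulus patterns, dimension A (index 0) as the relevant dimension, a response probability equal to the proportion of cues in the focus sample that dictate the correct response, and removal after each success of every cue that would have dictated the wrong response. Cues are coded as hypothetical `(dimension, value)` pairs naming the value that evokes R_1.

```python
from fractions import Fraction
from itertools import product

STIMULI = list(product((1, 2), repeat=3))  # eight equiprobable patterns
WT = Fraction(1, len(STIMULI))             # weight of each pattern on a trial

SAMPLE = ((0, 1), (1, 1), (2, 1))          # focus sample {A1, B1, C1}


def cue_correct(cue, stim):
    """True if the cue dictates the correct response to this stimulus."""
    dim, value = cue
    says_r1 = stim[dim] == value
    return says_r1 == (stim[0] == 1)       # R_1 is correct exactly when A = 1


def sum_prod(sample, n):
    """Sum over stimulus sequences of wt * Pr_1 * ... * Pr_n (cf. Eq. 4)."""
    if n == 0:
        return Fraction(1)
    total = Fraction(0)
    for stim in STIMULI:
        kept = tuple(c for c in sample if cue_correct(c, stim))
        if kept:  # a success is possible only if some cue is correct
            pr = Fraction(len(kept), len(sample))
            total += WT * pr * sum_prod(kept, n - 1)
    return total


def conditional_success(sample, n):
    """Pr(S_{n+1} | S_n ... S_1 E_0) as a ratio of sums (cf. Eq. 6)."""
    return sum_prod(sample, n + 1) / sum_prod(sample, n)


if __name__ == "__main__":
    for n in range(3):
        print(n + 1, conditional_success(SAMPLE, n))
```

For this focus sample the enumeration gives 2/3, 3/4, and 5/6 on the first three trials, which are the values obtained from (2) with x = 1, s = 3, and p = 1/2.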

3. A Matrix Formulation of the Focus Sample Problem

Examination of Table 1 suggests that a useful representation of the
process under study will result from classification into seven states, with
a revised organization leading eventually to four states. The seven states
are 1C (the probability of being correct is 1 and all cues in the focus
sample are correct); 1U3 (the probability of being correct is 1, but there
are three cues in the focus sample, not all of which are correct); 1U2 (like
1U3 but with two cues, not both of which are correct); States 2/3, 1/2, and
1/3 having probabilities of being correct given by their designations; and
the dropout State (D) having zero probability of a correct response because
the subject has made an error since $T_0$ and is therefore excluded from future
analyses.

It is easy to identify which state will be operative after a given
stimulus sequence by looking at a case number in Table 1 and examining the
probability values and $\{s\}$ entries. Consider Trial 2, Case III: For i
values of 3, 2, and 1 a person is in 1U3, 2/3, and 1/3, respectively. Persons
making errors on Trial 2 because they are in States 2/3 or 1/3 will go into
State D on Trial 3 and stay there thereafter. However, persons who are
correct on Trial 2 when in State 2/3 will go into State 1U2 or State 1/2 on Trial
3, depending upon whether the two cues remaining in their focus sample are
consistent or inconsistent with the next stimulus presented. Persons who are
correct when in State 1/3 on Trial 2 will go into State 1C on Trial 3 since
Table 1 shows that only $A_1$ will remain in their focus sample.

Rather than present a matrix for these seven states, we first expand
to 10 states by distinguishing between success (S) and error (E) substates
for the three states having fractional probabilities of a correct response.
This, together with examinations of probabilities of reaching various points
in Table 3, yields the following initial vector, with states in the order
1C, 1U3, 1U2, (2/3)S, (2/3)E, (1/2)S, (1/2)E, (1/3)S, (1/3)E, D:

$$P_1 = \begin{bmatrix} 0 & \tfrac{1}{4} & 0 & \tfrac{1}{3} & \tfrac{1}{6} & 0 & 0 & \tfrac{1}{12} & \tfrac{1}{6} & 0 \end{bmatrix} \tag{7}$$
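and transition matrix (rows give the current state, columns the next state):

$$
\begin{array}{l|cccccccccc}
 & 1C & 1U3 & 1U2 & (2/3)S & (2/3)E & (1/2)S & (1/2)E & (1/3)S & (1/3)E & D \\ \hline
1C     & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1U3    & 0 & 1/4 & 0 & 1/3 & 1/6 & 0 & 0 & 1/12 & 1/6 & 0 \\
1U2    & 0 & 0 & 1/2 & 0 & 0 & 1/4 & 1/4 & 0 & 0 & 0 \\
(2/3)S & 0 & 0 & 1/2 & 0 & 0 & 1/4 & 1/4 & 0 & 0 & 0 \\
(2/3)E & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
(1/2)S & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
(1/2)E & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
(1/3)S & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
(1/3)E & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
D      & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\tag{8}
$$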



Note that rows 1C, (1/2)S, and (1/3)S of this transition matrix are identical;
also rows 1U2 and (2/3)S; also rows (2/3)E, (1/2)E, (1/3)E, and D. By Burke and
Rosenblatt's (1958) Corollary 1 we can lump states having such identical rows
together, yielding the following 4-state model:

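With states in the order C, U3, U2, E,

$$P_1 = \begin{bmatrix} \tfrac{1}{12} & \tfrac{1}{4} & \tfrac{1}{3} & \tfrac{1}{3} \end{bmatrix} \tag{9}$$

and

$$
R = \begin{array}{l|cccc}
 & C & U3 & U2 & E \\ \hline
C  & 1 & 0 & 0 & 0 \\
U3 & 1/12 & 1/4 & 1/3 & 1/3 \\
U2 & 1/4 & 0 & 1/2 & 1/4 \\
E  & 0 & 0 & 0 & 1
\end{array}
\tag{10}
$$

where C implies that a subject will be correct with probability 1 hereafter;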
U3 means that a subject will be correct on the current trial but is still 
unconditioned in that at least one of the cues in the focus sample is irrele- 
vant; U2 means that a subject will be correct on the current trial but that 
one of the two cues in the focus sample is irrelevant, and E means either 
that an error will be made on the current trial or that the subject involved 
has already dropped out of the analysis because of a previous error. The 
proportion of subjects in the two sources of the E state can be determined by 
finding the difference between the proportions in E on Trials n and n - 1 ; 
the difference is the proportion of errors (out of all subjects) on Trial n . 
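The lumping step can be checked mechanically. The sketch below is an illustration under stated assumptions: the 10-state rows are our reading of the state descriptions (rows 1C, (1/2)S, and (1/3)S identical; rows 1U2 and (2/3)S identical; the four error rows identical), and the check used is that all states in a block send the same total probability to every block, which identical rows guarantee. The names `T10`, `BLOCKS`, and `lump` are hypothetical.

```python
from fractions import Fraction as F

# State order: 1C, 1U3, 1U2, (2/3)S, (2/3)E, (1/2)S, (1/2)E, (1/3)S, (1/3)E, D
ROW_C = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
ROW_U3 = [0, F(1, 4), 0, F(1, 3), F(1, 6), 0, 0, F(1, 12), F(1, 6), 0]
ROW_U2 = [0, 0, F(1, 2), 0, 0, F(1, 4), F(1, 4), 0, 0, 0]
ROW_E = [0] * 9 + [1]

T10 = [ROW_C, ROW_U3, ROW_U2, ROW_U2, ROW_E,
       ROW_C, ROW_E, ROW_C, ROW_E, ROW_E]

# Lumped state -> indices of the original states it absorbs.
BLOCKS = {"C": [0, 5, 7], "U3": [1], "U2": [2, 3], "E": [4, 6, 8, 9]}


def lump(matrix, blocks):
    """Collapse a transition matrix over a partition of its states."""
    lumped = []
    for members in blocks.values():
        rows = [
            [sum(matrix[i][j] for j in cols) for cols in blocks.values()]
            for i in members
        ]
        # Every state in a block must yield the same lumped row.
        assert all(row == rows[0] for row in rows), "partition is not lumpable"
        lumped.append(rows[0])
    return lumped


R4 = lump(T10, BLOCKS)

if __name__ == "__main__":
    for name, row in zip(BLOCKS, R4):
        print(name, [str(F(v)) for v in row])
```

The resulting four rows, in the order C, U3, U2, E, are the ones whose square and cube can be compared with the matrices labelled (26) and (27).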

We must now find an expression for $R^n$ in order to obtain explicit
trial by trial predictions based on the well known relation:

$$P_{n+1} = P_1 R^n. \tag{11}$$



A method from Goldberg (1958, pp. 229-231, and exercises 10 and 11, pp. 244-245)
leads us first to find the characteristic roots of R from (10) by solving
the following determinantal equation:

$$f(\lambda) = |R - \lambda I| = 0, \tag{12}$$

where I is an identity matrix, obtaining $\lambda_1 = 1$, $\lambda_2 = 1/4$,
$\lambda_3 = 1/2$, and $\lambda_4 = 1$. The Cayley-Hamilton theorem asserts that if
$f(\lambda) = 0$ as required by (12), then $f(R)$, using the same constants as in
(12) but replacing powers of $\lambda$ by corresponding powers of R, will equal
the null matrix. In the present example, each equation will be a polynomial of
the fourth degree.

Now it is possible to write $\lambda^n$ in the form:

$$\lambda^n = f(\lambda)\,q(\lambda) + r(\lambda) \tag{13}$$

where $q(\lambda)$ is of degree n - 4 since $f(\lambda)$ has degree 4 and $r(\lambda)$ has
at most degree 3, else $r(\lambda)$ could be factored by $f(\lambda)$. Goldberg cites
a proof that the corresponding matrix equation holds as a consequence of (13):

$$R^n = f(R)\,q(R) + r(R). \tag{14}$$

Invoking the conditions defined by (12) and by the Cayley-Hamilton theorem
yields, from (13) and $f(\lambda) = 0$,

$$\lambda^n = r(\lambda) = a_0 + a_1\lambda + a_2\lambda^2 + a_3\lambda^3, \tag{15}$$

and, from (14) and $f(R) = 0$,

$$R^n = r(R) = a_0 I + a_1 R + a_2 R^2 + a_3 R^3, \tag{16}$$

since $r(\lambda)$ and $r(R)$ must be of degree 3 or less.
We must now solve for the coefficients from (15) and apply them to (16).
A slight complication arises because $\lambda_1$ and $\lambda_4$ are equal, yielding three
independent equations, rather than four, from (15). Therefore, we differentiate
both sides of (15) with respect to $\lambda$, for $\lambda_4 = 1$:

$$n\lambda^{n-1} = a_1 + 2a_2\lambda + 3a_3\lambda^2. \tag{17}$$



Substituting the values of $\lambda_1$ through $\lambda_4$ in (15) or (17), as appropriate,
yields the following system of equations:

$$1 = a_0 + a_1 + a_2 + a_3 \tag{18}$$

$$(1/4)^n = a_0 + a_1/4 + a_2/16 + a_3/64 \tag{19}$$

$$(1/2)^n = a_0 + a_1/2 + a_2/4 + a_3/8 \tag{20}$$

and

$$n = a_1 + 2a_2 + 3a_3 \tag{21}$$

which can also be expressed in matrix form:

$$(c) = C(a) \tag{22}$$

where (c) is the column vector on the left-hand side of the set of
equations, C is the matrix of coefficients of the $a_j$'s, and (a) is a
column vector of $a_j$'s. C is nonsingular; therefore, (22) implies:

$$(a) = C^{-1}(c) \tag{23}$$



Now inverting C yields:

$$
C^{-1} = \frac{1}{9}
\begin{bmatrix}
13 & 32 & -36 & -3 \\
-88 & -128 & 216 & 21 \\
164 & 160 & -324 & -42 \\
-80 & -64 & 144 & 24
\end{bmatrix}
\tag{24}
$$



from which (a) has been computed using (23) and has values equal to the
coefficients of the $R^n$ terms below based on (16):

$$
\begin{aligned}
R^n ={}& \tfrac{1}{9}\left[13 + 32(1/4)^n - 36(1/2)^n - 3n\right] I
 + \tfrac{1}{9}\left[-88 - 128(1/4)^n + 216(1/2)^n + 21n\right] R \\
 &+ \tfrac{1}{9}\left[164 + 160(1/4)^n - 324(1/2)^n - 42n\right] R^2
 + \tfrac{1}{9}\left[-80 - 64(1/4)^n + 144(1/2)^n + 24n\right] R^3
\end{aligned}
\tag{25}
$$
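The coefficient solution can be checked against direct matrix powers. The sketch below is a verification aid, not the author's procedure: it takes the inverse matrix of (24) as given, forms the coefficients from the vector (1, (1/4)^n, (1/2)^n, n), and compares the resulting cubic polynomial in R with the n-th power computed by repeated multiplication. The 4-state matrix `R` (states C, U3, U2, E) is an assumption reconstructed from the state descriptions and is consistent with the entries of (26) and (27); all function names are ours.

```python
from fractions import Fraction as F

R = [
    [1, 0, 0, 0],
    [F(1, 12), F(1, 4), F(1, 3), F(1, 3)],
    [F(1, 4), 0, F(1, 2), F(1, 4)],
    [0, 0, 0, 1],
]

# Nine times the inverse coefficient matrix, as in Eq. (24).
C_INV_TIMES_9 = [
    [13, 32, -36, -3],
    [-88, -128, 216, 21],
    [164, 160, -324, -42],
    [-80, -64, 144, 24],
]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matpow(m, n):
    """n-th power by repeated multiplication (n >= 0)."""
    out = [[F(int(i == j)) for j in range(len(m))] for i in range(len(m))]
    for _ in range(n):
        out = matmul(out, m)
    return out


def coefficients(n):
    """a_0 ... a_3 from (a) = C^{-1}(c), cf. Eqs. (18)-(23)."""
    c = [F(1), F(1, 4) ** n, F(1, 2) ** n, F(n)]
    return [sum(row[j] * c[j] for j in range(4)) / 9 for row in C_INV_TIMES_9]


def power_via_polynomial(n):
    """R^n = a_0 I + a_1 R + a_2 R^2 + a_3 R^3, cf. Eq. (25)."""
    a = coefficients(n)
    powers = [matpow(R, k) for k in range(4)]
    return [[sum(a[k] * powers[k][i][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


if __name__ == "__main__":
    for n in range(6):
        print(n, power_via_polynomial(n) == matpow(R, n))
```

Exact rational arithmetic is used because the coefficients involve large cancelling terms, and floating-point comparison would need a tolerance.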

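We now need values of $R^2$ and $R^3$, so that (25) may be applied. By direct
calculation, from (10), with rows and columns in the order C, U3, U2, E,

$$
R^2 = \begin{array}{l|cccc}
 & C & U3 & U2 & E \\ \hline
C  & 1 & 0 & 0 & 0 \\
U3 & 9/48 & 1/16 & 1/4 & 1/2 \\
U2 & 3/8 & 0 & 1/4 & 3/8 \\
E  & 0 & 0 & 0 & 1
\end{array}
\tag{26}
$$

and

$$
R^3 = \begin{array}{l|cccc}
 & C & U3 & U2 & E \\ \hline
C  & 1 & 0 & 0 & 0 \\
U3 & 49/192 & 1/64 & 7/48 & 7/12 \\
U2 & 7/16 & 0 & 1/8 & 7/16 \\
E  & 0 & 0 & 0 & 1
\end{array}
\tag{27}
$$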
For present purposes it is sufficient to calculate the last column of $R^n$,
which will be called $(R^n)_E$. Use of (10), (26), and (27) in (25) leads,
after simplification, to:

$$
(R^n)_E = \begin{array}{l|c}
 & E \\ \hline
C  & 0 \\
U3 & (2/3)\left[1 - (1/2)^n\right] \\
U2 & (1/2)\left[1 - (1/2)^n\right] \\
E  & 1
\end{array}
\tag{28}
$$

We know from (11) that the probability of being in State E on $T_{n+1}$ is given
by:

$$P_{n+1}(E) = P_1 (R^n)_E = 2/3 - (1/3)(1/2)^n \quad \text{by (9) and (28).} \tag{29}$$

But the probability of a success on $T_{n+1}$ is

$$P_{n+1}(S) = 1 - P_{n+1}(E) = (1/3)\left[1 + (1/2)^n\right] \quad \text{from (29).} \tag{30}$$

We have just found the probability of a success on $T_{n+1}$, computed from
among all subjects who made an error on $T_0$. To make this probability
conditional upon having been tested on $T_{n+1}$, we note that we are dealing only
with those subjects who were successful on $T_1$ through $T_n$ and then were
also successful on $T_{n+1}$. Therefore,

$$\Pr(S_{n+1} \mid S_n \cdots S_1 E_0) = \frac{1 + (1/2)^n}{1 + (1/2)^{n-1}} = \frac{1 + 2(1/2)^{n+1}}{1 + 2(1/2)^n} \quad \text{from (30).} \tag{31}$$

But (31) is equivalent to (2) for x = 1, s = 3, and p = 1/2, the
conditions operative in our example. Thus (2) has been verified for the focus
sample of Table 1 and equiprobable, independent stimuli.
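The closed forms of this section can be cross-checked by iterating the 4-state chain directly. This is a sketch under the assumption that the initial vector is (1/12, 1/4, 1/3, 1/3) over the states C, U3, U2, E and that the transition matrix is the one consistent with (26) and (27); the helper names are ours.

```python
from fractions import Fraction as F

P1 = [F(1, 12), F(1, 4), F(1, 3), F(1, 3)]  # states C, U3, U2, E on Trial 1
R = [
    [1, 0, 0, 0],
    [F(1, 12), F(1, 4), F(1, 3), F(1, 3)],
    [F(1, 4), 0, F(1, 2), F(1, 4)],
    [0, 0, 0, 1],
]
E = 3  # index of the absorbing error/dropout state


def state_vector(trial):
    """P_trial = P_1 R^(trial - 1), for trial >= 1 (cf. Eq. 11)."""
    vec = P1
    for _ in range(trial - 1):
        vec = [sum(vec[i] * R[i][j] for i in range(4)) for j in range(4)]
    return vec


def conditional_success(n):
    """Pr(S_{n+1} | S_n ... S_1 E_0) for n >= 1, from the chain."""
    still_errorless_after = 1 - state_vector(n + 1)[E]
    still_errorless_before = 1 - state_vector(n)[E]
    return still_errorless_after / still_errorless_before


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, state_vector(n + 1)[E], conditional_success(n))
```

Because State E pools current errors with earlier dropouts, one minus its probability on a trial is the probability of an unbroken run of successes through that trial, which is what makes the ratio a conditional probability.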

Extension of the Matrix Formulation to New Focus Samples but the Same 
Experiment 

Trabasso and Bower (1968, pp. 59-60) assume that a subject selects a
focus sample by a replacement sampling method in which any one of the K
different dimensions has a specific probability of being selected as the first
member of the sample, and the same, independent probability of being selected
as the second, third, ... or K-th member of the sample. Consequently a focus
sample of size s will have from 0 to s elements from any particular
dimension. The three-dimensional binary task with s = 3 which we have been
considering has 10 distinct focus samples, ignoring order, and 27 samples when
order is considered. (Other focus samples would be possible if the stimulus on
$T_0$ were different. See Sec. 4.) Table 3 lists the 10 basic focus samples.

Insert Table 3 about here 



Sample 10, $\{A_1, B_1, C_1\}$, has already been investigated above. Ideally a
single matrix proof could be developed for (2) which would hold for all 10
samples. Unfortunately Table 3 shows that the rank of the transition matrix
varies from 1 to 4 in the 10 samples under consideration. (In each case the
rank is also equal to the dimensionality of the matrix.) Therefore, we have
determined an initial vector $P_1$ and the matrix R for each asterisked
sample of Table 3, determined the form of $R^n$, and verified that in each
case (2) follows from use of $P_1$ and $R^n$ in (11). By symmetry, (2) also
holds for each unasterisked sample.

Once (2) or some other equation is known to hold for a focus sample and
all possible focus samples have been investigated as above (with the
possibility of some samples conforming to different equations or even different
forms of equations), the probability of solution of the problem can be determined
for each focus sample using (2.2) and the sentence following it in Trabasso
and Bower (1968, p. 56), and a weighted average probability of solution can
be obtained from their (2.3) and (2.4) once one makes a saliency assumption,
i.e., specifies the probability of selecting each dimension for use in the
focus sample. An equal saliency assumption will, of course, make each of the
27 permutations of Table 3 equally likely.
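The counts of 10 unordered and 27 ordered focus samples, and the sample probabilities implied by equal saliency, can be enumerated directly. In this sketch the cue labels and variable names are ours; the three cues are those consistent with R_1 after an error on the stimulus A1 B1 C1, and sampling is with replacement.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement, product

CUES = ("A1", "B1", "C1")  # one cue per dimension, each equally salient
S = 3                      # focus sample size

# Samples ignoring order (multisets) and samples with order considered.
unordered = list(combinations_with_replacement(CUES, S))
ordered = list(product(CUES, repeat=S))

# Under equal saliency every ordered sample is equally likely, so an
# unordered sample's probability is its number of orderings over 27.
orderings = Counter(tuple(sorted(sample)) for sample in ordered)
prob = {sample: Fraction(orderings[sample], len(ordered)) for sample in unordered}

if __name__ == "__main__":
    print(len(unordered), len(ordered))
    for sample, value in prob.items():
        x = sample.count("A1")  # number of relevant cues in the sample
        print(sample, "x =", x, "probability", value)
```

Unequal saliency would replace the uniform weighting by a product of per-dimension selection probabilities for each ordered sample.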

How Many Trials Must be Examined to Identify the Different States for a 
Problem with a Specific Focus Sample When s and K Are Large? 

The matrix method just presented would be inconvenient if it were neces- 
sary to consider all possible stimulus sequences, and consequent focus samples 
in a series longer than the three trials examined above. Suppose K is very 
large, perhaps 15, and s is even larger, perhaps 20, implying that at least 
one dimension is represented more than once in the original focus sample. 

Will this make it necessary to examine more than three trials? 
The query just posed may be answered by noting, first, that use of all
possible K-dimensional stimuli (excluding complements if desired) on Trial
1 will ensure that all possible combinations of dimensions are retained by
various subjects at the end of that trial, excluding possibilities in which
the relevant dimension was represented on Trial 1 but not afterward. Thus
there will be 1-tuples, 2-tuples, ..., K-tuples represented in new focus samples,
with the label on a tuple identifying the number of dimensions represented in
a sample, not the number of elements. Because starting with a multiple
representation of any dimension can be followed only by keeping all representatives
of the dimension or discarding all representatives, no new combinations of
dimensions can be produced after Trial 2. But use of all possible stimuli on
Trial 2 does enlarge the set of different $\{s\}$ values by producing all
possible consequences on any specific tuple. Consequently Trial 3 will always
include all possible $\{s\}$ values provided that all possible stimuli were
presented on Trial 1 and independently on Trial 2 as well.



4. The Case of Constant Partial Relevance and
Constant Predictability with Pr ≠ .5



Suppose that, in the example given in Table 1, the four stimuli $A_1B_1C_1$,
$A_1B_1C_2$, $A_1B_2C_1$, and $A_1B_2C_2$ were assigned the probabilities .36, .24, .24,
and .16 respectively, yielding $\Pr(B_1 \mid A_1) = \Pr(C_1 \mid A_1) = .6$, so that the
partial relevance, p, was constant at .6. [Since the numbering system for
$B_1$, $B_2$, $C_1$, and $C_2$ is arbitrary, reversal of numbers for $B_1$ and $B_2$
and for $C_1$ and $C_2$ would have yielded p = 1 - .60 = .40 for each irrelevant
dimension. We adopt the convention of numbering each irrelevant dimension's
values so as to maximize each partial relevance, $\Pr(B_1 \mid A_1)$, $\Pr(C_1 \mid A_1)$,
etc.] Then Table 2 would require a new row of weight values, yielding
different values for $\Pr(S_1 \mid E_0)$ and related quantities. The new predictions would
conform to Eq. 2, showing another case in which Trabasso and Bower's equations
hold logically.

Fisher (in press) has shown that in general if the partially relevant
hypotheses in the focus sample are divided "into groups according to their
probability of producing a correct response (group i will have $h_i$ elements
each of which is associated with correct responding with a probability $p_i$),"

$$\Pr(S_1 \mid E_0) = \frac{x + \sum_i p_i h_i}{x + \sum_i h_i} = \frac{x + \sum_i p_i h_i}{s}, \tag{32}$$

which reduces to our Eq. 1 if p = 1/2. Note that two dimensions, B and C,
might have the same partial relevance, p, and yet have hypotheses with the
same partial relevance ($p_1 = p_2 = p$ for $\{A_1, B_1, C_1\}$ or $p_1 = p_2 = 1 - p$
for $\{A_1, B_2, C_2\}$) or different partial relevances ($p_1 = p$, $p_2 = 1 - p$ for
$\{A_1, B_1, C_2\}$). Note also that the i of Eq. 32 is not the congruence value,
i, discussed earlier.

If $p_1 = p_2 = p$, Eq. 32 also reduces to Eq. 1, increasing the number of
cases in which Trabasso and Bower's conclusions hold. Eq. 2 will also hold
in this case, as well as when p = 1/2.

A case in which Eq. 32 must be employed is easily illustrated by letting
the stimulus for $T_0$ from Table 1 be $A_1B_1C_2$. Since an error was made,
one acceptable focus sample is $\{A_1, B_1, C_2\}$. An analogue of Table 2 (not
presented) shows that $\Pr(S_1 \mid E_0) = .667$, $\Pr(S_2 \mid S_1E_0) = .759$, and
$\Pr(S_3 \mid S_2S_1E_0) \approx .842$ for a sequence of trials beginning with this focus
sample and $p_1 = .6$, $p_2 = .4$ based on the stimulus probabilities discussed
at the beginning of Sec. 4. In contrast Trabasso and Bower [our Eq. (2) with
p = .6] would have predicted $\Pr(S_1 \mid E_0) = .733$, $\Pr(S_2 \mid S_1E_0) = .782$, and
$\Pr(S_3 \mid S_2S_1E_0) = .833$.

Fisher's Sec. D gives the result:

$$\Pr(E_{n+1}S_n \cdots S_1 \mid E_0) = \sum_i p_i^{\,n}(1 - p_i)\,\frac{h_i}{s}, \tag{33}$$

where $p_i$ and $h_i$ are defined as in (32). Eq. (33) can be used to determine
$\Pr(S_{n+1}S_n \cdots S_1 \mid E_0)$ or its equivalent from (4), $\Sigma\,\mathrm{Prod}_{n+1}$:

$$\Pr(S_{n+1}S_n \cdots S_1 \mid E_0) + \Pr(E_{n+1}S_n \cdots S_1 \mid E_0) = \Pr(S_n \cdots S_1 \mid E_0) \tag{34}$$

by elementary probability theory. Combining (4) and (34) yields:

$$\Sigma\,\mathrm{Prod}_{n+1} = \Sigma\,\mathrm{Prod}_n - \Pr(E_{n+1}S_n \cdots S_1 \mid E_0). \tag{35}$$

Since $\Sigma\,\mathrm{Prod}_1 = \Pr(S_1 \mid E_0)$ from (3) and the discussion following it,
(32), (33), and (35) permit a recursion to be performed in order to determine
the quantities required to apply (6) for any n.

The method just described may also be applied to the example with a
focus sample of $\{A_1, B_1, C_2\}$. The two $h_i$ are each unity, $p_1$ (for the B
variable) is .6, and $p_2$ (for the C variable) is .4. Equation 33 yields
$\Pr(E_2S_1 \mid E_0) = .160$ and $\Pr(E_3S_2S_1 \mid E_0) = .080$. Equation 32 yields
$\Pr(S_1 \mid E_0) = .667$; and Eqs. (6) and (35) then imply $\Pr(S_2 \mid S_1E_0) = .760$ and
$\Pr(S_3 \mid S_2S_1E_0) = .842$, these predictions always being within .001 of those
reported before for an analogue of Table 2.
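The recursion just described is short enough to sketch in code. The forms assumed here are Pr(S_1 | E_0) = (x + sum of p_i h_i)/s for Eq. 32 and Pr(E_{n+1} S_n ... S_1 | E_0) = sum of p_i^n (1 - p_i) h_i / s for Eq. 33, with s = x plus the sum of the h_i; the function name `fisher_conditionals` and the `(p_i, h_i)` pair representation are ours.

```python
from fractions import Fraction as F


def fisher_conditionals(x, groups, n_trials):
    """Return [Pr(S_1|E_0), Pr(S_2|S_1 E_0), ...] for n_trials trials.

    x: number of relevant cues in the focus sample
    groups: list of (p_i, h_i) pairs for the partially relevant cues
    """
    s = F(x + sum(h for _, h in groups))
    prod = (x + sum(p * h for p, h in groups)) / s        # Eq. 32: Sum Prod_1
    conditionals = [prod]
    for n in range(1, n_trials):
        # Eq. 33: probability of n successes followed by an error.
        error_after_run = sum(p ** n * (1 - p) * h for p, h in groups) / s
        next_prod = prod - error_after_run                # Eq. 35
        conditionals.append(next_prod / prod)             # Eq. 6
        prod = next_prod
    return conditionals


if __name__ == "__main__":
    # Focus sample {A1, B1, C2} with p_1 = .6 (B) and p_2 = .4 (C).
    values = fisher_conditionals(1, [(F(3, 5), 1), (F(2, 5), 1)], 3)
    print([round(float(v), 3) for v in values])
```

With p_1 = .6 and p_2 = .4 the exact results are 2/3, 19/25, and 16/19, which round to the .667, .760, and .842 quoted in the text.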
5. Further Empirical Implications of the Trabasso-Bower Multiple-Look Model

Redundant Relevant Dimensions

Trabasso and Bower developed their model for the specific purpose of
treating behavior in the presence of redundant relevant dimensions. The
foregoing analyses are in no way changed if we assume k redundant relevant
dimensions so that $A_1$ ($A_2$) is always also accompanied by $A'_1$ ($A'_2$), ...,
$A_1^{(k-1)}$ [$A_2^{(k-1)}$]. There is no special advantage in discriminating which of
the x relevant cues in the initial focus sample comes from each relevant
dimension, so we may as well call them all $A_1$ as in Table 1. Any effect of
having redundant relevant dimensions will be reflected in modifications
of the probabilities of the different initial focus samples of size s. Thus
for Table 3, equal salience and a single relevant dimension would yield
probabilities of 1/3 for each cue to be sampled. Equal salience and k redundant
relevant dimensions would yield probabilities of 1/(k + 2) for each of the
two irrelevant cues and k/(k + 2) for a relevant cue, all k of which are
labeled $A_1$, to be sampled. Note that each i value in Table 1 is increased by
(k - 1) if there are k redundant relevant cues.

Specific Stimulus Sequences 

Each of the columns of Table 2 has $\mathrm{Pr}_1$ and $\mathrm{Pr}_2$ values giving the
probability of a success on $T_1$ and a subsequent success on $T_2$ for specific
stimulus values presented in sequence, as well as $\mathrm{Pr}_3$ values giving the
average probability of success on $T_3$ following the sequence of $T_1$ and $T_2$,
conditional on success on both previous trials. Tables 1 and 2 could be
expanded for larger n in order to treat longer stimulus sequences. However,
a more convenient method is to find a sequence of matrices comparable to that
of (10), with each one appropriate to the stimulus on a certain trial, applying
them in series:

$$P_{n+1} = P_1 \prod_{j=1}^{n} R_j, \tag{36}$$

where $R_j$ is the transition matrix appropriate to the stimulus change from
$T_j$ to $T_{j+1}$. This method of prediction is illustrated in detail in Cotton
(in press), using a single-look model.

A very severe test of the present model is suggested by examination of
Table 1 for congruence values (i) of 1: First, consider the case in which
$x \ge 1$. Among all subjects who erred on $T_0$ and had i = 1 in Table 1 (or
had i = the number of relevant dimensions for a more general case) on $T_1$ and
who were successful on $T_1$, none will keep an irrelevant cue on $T_2$ because
no irrelevant cue placed in the focus sample on $T_0$ can be consistent with the
relevant cue(s) on $T_1$, by the definition of congruence. Thus none of the
subjects with this history will ever again make an error on this problem.

Now consider the case in which x = 0. For example, let the stimulus on
$T_0$ be $A_1B_1C_1$ and $\{s\} = \{B_1, B_1, C_1\}$ be the focus sample selected to be
consistent with reinforcement of $R_1$ on $T_0$, with A being the relevant
dimension. On $T_1$, for which i = 1, the B and C dimension values on $T_1$
will both be inconsistent with the value of the A dimension on $T_1$.
Therefore, the probability of a correct answer on $T_1$ will be zero. This conclusion
holds for any case in which x = 0. Consequently, all subjects who err on $T_0$,
have i = the number of relevant dimensions on $T_1$, and are successful on $T_1$,
will be errorless ever after, according to the multiple-look model. This
implication can be expanded to permit the i = 1 successful trial to occur after $T_1$;
we do not examine the logic of that case here. Failure of this prediction
is equivalent to failure of a strict "local consistency" theory (Gregg & Simon,
1967).

We do not know of a published set of data bearing upon this prediction.
However, Pyle (1969) performed two experiments in which his Group 1 had
i = 1 on Trial 2 and on all subsequent trials except those numbered with
multiples of 5. Raw data kindly provided by Pyle show that in Experiment 1
only 12 of 18 subjects with a success on an i = 1 trial following an error
made no further errors. The corresponding result for Experiment 2 was 20 out
of 31.

Cotton and Rhone (1970) have performed an experiment in which Group 1 has
the same i values as in Pyle's two Group 1's. Among 23 subjects in Cotton
and Rhone's Group 1, 18 had an error on $T_0$ and a success on $T_1$, for some
arbitrary $T_1$ not equal to a multiple of 5. Of these 18, 9 made no further
errors in the 24-trial sequence given all subjects. Thus 9 out of 18 subjects
exhibited behavior flatly contradicting the strong prediction just derived from
the multiple-look model.

It is easy to show that the prediction of errorless performance once a
correct response is given with i = number of relevant dimensions (assuming an
error on the previous trial) also follows from Trabasso and Bower's (1968, pp.
219-226) modified multiple-look model. That model assumes that, following a
correct response, the subject has probability b of excluding inappropriate
hypotheses on the same basis as the original model and probability 1 - b of
excluding them but resampling from locally consistent hypotheses in order to
keep the size of s constant. The hypothesis which has been in the focus
sample for the greatest number of correct trials is called the dominant
hypothesis and will control the response on any given trial. Since the i = number
of relevant dimensions condition assures that the correct hypothesis or
hypotheses will be the only one(s) in $\{s\}$ on $T_1$ which were also confirmed on the
error trial $T_0$, previous to resampling, the correct hypothesis or hypotheses
will be the dominant one(s) on the next trial, will again be confirmed and
still be dominant, etc., assuring no subsequent errors.



Prediction of the Distribution of Runs of All Successes or All Errors 

Trabasso and Bower (1968, pp. 55-56) derive an equation for the probability 
of a run of n successive successes following an error: Pr(H = n) = (1 - p)p^n. 
For all focus samples for which Eq. 2 holds, Trabasso and Bower's formula for 
Pr(H = n) stands as given. Since this formula does not depend directly upon 
either x or s, a subject could shift from one acceptable focus sample to 
another following each error (as he is assumed to do by the theory) and yet 
the same equation would hold throughout his session, permitting calculation of 
a variety of run statistics such as those presented in Bower and Trabasso (1964) 
for a single-look model. We emphasize a point inherent in Trabasso and Bower's 
discussion: The case s - x = 0 is acceptable for a focus sample because it 
will produce learning, making Pr(H = ∞) = 1 at the end of the experiment. 
However, this serves to emphasize that the learning parameter, x/s, defined 
in their (2.2), is most assuredly not constant within a session for a single s 
but rather ranges from 0 when x = 0 to 1 when x = s. 
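
The run-length formula quoted above, Pr(H = n) = (1 - p)p^n, can be evaluated numerically. The sketch below is a minimal illustration that simply computes the stated expression for n = 0, 1, 2, ...; the function names and the value p = 0.5 are assumptions chosen for the example, and p is treated as a free parameter rather than estimated from data.

```python
def run_length_pmf(p, n):
    """Evaluate Pr(H = n) = (1 - p) * p**n for a run of n successes after an error."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    return (1.0 - p) * p ** n


def mean_run_length(p):
    """Mean of the geometric distribution above: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


if __name__ == "__main__":
    p = 0.5  # illustrative value only
    for n in range(5):
        print(n, run_length_pmf(p, n))
    # The probabilities over n = 0, 1, 2, ... form a geometric series summing to 1.
    print(sum(run_length_pmf(p, n) for n in range(200)))
    print(mean_run_length(p))
```

Because the expression involves only p, the same function applies whatever acceptable focus sample the subject holds, which is the point made in the paragraph above.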

For the general case, Fisher (in press) has shown that Pr(H = n) = 

— — [2p^(l - p^)lu) , using the same notation as in Eq. 52. This equation 
reduces to the Trabasso and Bower result for any case in which Eqs. 1 and 2 
hold. A final thought about this general case: One error may occur in 
response to one stimulus pattern, as in Table 1; the next error may occur in 
response to a different pattern, as in our later example, so that Pr(H = n) 
must be computed separately for each case because the p_i values will not be 
constant throughout the experiment even though the partial relevance of any 
cue is constant. Introduction of sampling schemes for focus samples, as in 
Trabasso and Bower (1968, pp. 57-60), must receive careful mathematical 
analysis since this problem of shifting p_i values has not previously been 
noted. 

6. Summary 

Two methods of deriving predictions for the Trabasso and Bower multiple-look 
concept identification model have been examined. A method of directly 
calculating the effects of every possible stimulus sequence is practical only 
for small numbers of trials and must be used separately for each possible 
focus sample of a given size. However, it can be employed for cases of 
partially relevant cues, redundant relevant cues, and a single stimulus 
sequence for all subjects. This method reveals a very strong implication of 
the model: Among subjects who make an error on some trial T_0 and who are 
correct on the immediately following trial, on which the congruence i equals 
the number of relevant dimensions, there will be no further errors. Existing 
data on this point contradict the theory. 

A matrix method of proof is applicable for all trial numbers and is other- 
wise comparable to the first method. Use of the first method for three trials 
will normally be necessary to determine the appropriate transition matrix, 
which varies from one focus sample to another. 

This paper also discusses Fisher's demonstration that certain Trabasso 
and Bower equations sometimes fail to hold if p ≠ .50. Her conclusions 
are shown to imply a general procedure for calculating the probability of a 
success on Trial n given success on all previous trials since an error on 
Trial 0. 

Table 1 

An Examination of Stimulus Patterns, Congruence Values (i), Success Probabilities (Pr) 
Conditional on Stimulus Patterns, and Attendant Focus Samples {s}, with s = 3 

Table 3 

Listing of All Possible Initial Focus Samples in a Three-Dimensional 
Problem with s = 3, S = A_1B_1C_1 on T_0, and Dimension "A" Relevant 

Focus                                   No.             Rank of
Sample    {s}                   x       Permutations    R Matrix

 1*       {A_1, A_1, A_1}       3       1               1
 2*       {B_1, B_1, B_1}       0       1               2
 3        {C_1, C_1, C_1}       0       1               2
 4*       {B_1, B_1, C_1}       0       3               4
 5        {B_1, C_1, C_1}       0       3               4
 6*       {A_1, A_1, B_1}       2       3               3
 7        {A_1, A_1, C_1}       2       3               3
 8*       {A_1, B_1, B_1}       1       3               3
 9        {A_1, C_1, C_1}       1       3               3
10*       {A_1, B_1, C_1}       1       6               4

                                        Sum = 27

*By symmetry, any unasterisked sample behaves like the 
asterisked sample above it. Only asterisked samples were 
explicitly investigated. 
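
The x and No. Permutations columns of Table 3 can be checked by brute-force enumeration. The sketch below lists every ordered draw of s = 3 cues from A_1, B_1, C_1, groups the draws into unordered focus samples, and counts the relevant cues in each. The cue labels, the function name, and the row ordering are assumptions made for this illustration; the rank of the R matrix is not computed here.

```python
from collections import Counter
from itertools import product

CUES = ("A1", "B1", "C1")   # cue values present on the error trial (assumed labels)
RELEVANT = {"A1"}           # dimension A is the relevant one


def focus_sample_table(cues=CUES, s=3, relevant=RELEVANT):
    """Return rows (sample, x, n_permutations) for every unordered focus sample."""
    # Each ordered draw of s cues, with replacement, is one permutation.
    counts = Counter(tuple(sorted(draw)) for draw in product(cues, repeat=s))
    rows = []
    for sample, n_perm in sorted(counts.items()):
        x = sum(1 for cue in sample if cue in relevant)  # number of relevant cues
        rows.append((sample, x, n_perm))
    return rows


if __name__ == "__main__":
    rows = focus_sample_table()
    for sample, x, n_perm in rows:
        print("{" + ", ".join(sample) + "}", x, n_perm)
    print("Sum =", sum(n_perm for _, _, n_perm in rows))
```

The enumeration yields ten distinct samples whose permutation counts total 3^3 = 27, in agreement with the table.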